AITopics | modular arithmetic

Collaborating Authors

modular arithmetic

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

986e0caad271b59417287737416d8594-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 01:33:16 GMT

artificial intelligence, machine learning, sequence, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.78)

Add feedback

SALSA: Attacking Lattice Cryptography with Transformers Emily Wenger University of Chicago Mingjie Chen University of Birmingham Francois Charton

Neural Information Processing SystemsNov-16-2025, 23:02:49 GMT

The looming threat of quantum computers has upended the field of cryptography.

machine learning, natural language, salsa, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.40)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

Learning Modular Exponentiation with Transformers

Africa, David Demitri, Kapoor, Sara M., Sorg, Theo Simon, Mishra, Challenger

arXiv.org Artificial IntelligenceOct-24-2025

Modular exponentiation is crucial to number theory and cryptography, yet remains largely unexplored from a mechanistic interpretability standpoint. We train a 4-layer encoder-decoder Transformer model to perform this operation and investigate the emergence of numerical reasoning during training. Utilizing principled sampling strategies, PCA-based embedding analysis, and activation patching, we examine how number-theoretic properties are encoded within the model. We find that reciprocal operand training leads to strong performance gains, with sudden generalization across related moduli. These synchronized accuracy surges reflect grokking-like dynamics, suggesting the model internalizes shared arithmetic structure. We also find a subgraph consisting entirely of attention heads in the final layer sufficient to achieve full performance on the task of regular exponentiation. These results suggest that transformer models learn modular arithmetic through specialized computational circuits, paving the way for more interpretable and efficient neural approaches to modular exponentiation.

exponentiation, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2506.23679

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

SALSA: Attacking Lattice Cryptography with Transformers Emily Wenger University of Chicago Mingjie Chen University of Birmingham Francois Charton

Neural Information Processing SystemsAug-19-2025, 13:28:02 GMT

The looming threat of quantum computers has upended the field of cryptography.

machine learning, natural language, salsa, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.40)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

Teaching Transformers Modular Arithmetic at Scale

Saxena, Eshika, Alfarano, Alberto, Wenger, Emily, Lauter, Kristin

arXiv.org Artificial IntelligenceOct-4-2024

Modular addition is, on its face, a simple operation: given $N$ elements in $\mathbb{Z}_q$, compute their sum modulo $q$. Yet, scalable machine learning solutions to this problem remain elusive: prior work trains ML models that sum $N \le 6$ elements mod $q \le 1000$. Promising applications of ML models for cryptanalysis-which often involve modular arithmetic with large $N$ and $q$-motivate reconsideration of this problem. This work proposes three changes to the modular addition model training pipeline: more diverse training data, an angular embedding, and a custom loss function. With these changes, we demonstrate success with our approach for $N = 256, q = 3329$, a case which is interesting for cryptographic applications, and a significant increase in $N$ and $q$ over prior work. These techniques also generalize to other modular arithmetic problems, motivating future work.

accuracy, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.03569

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry:

Government (0.46)
Information Technology > Security & Privacy (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

Mallinar, Neil, Beaglehole, Daniel, Zhu, Libin, Radhakrishnan, Adityanarayanan, Pandit, Parthe, Belkin, Mikhail

arXiv.org Machine LearningJul-29-2024

Neural networks trained to solve modular arithmetic tasks exhibit grokking, a phenomenon where the test accuracy starts improving long after the model achieves 100% training accuracy in the training process. It is often taken as an example of "emergence", where model ability manifests sharply through a phase transition. In this work, we show that the phenomenon of grokking is not specific to neural networks nor to gradient descent-based optimization. Specifically, we show that this phenomenon occurs when learning modular arithmetic with Recursive Feature Machines (RFM), an iterative algorithm that uses the Average Gradient Outer Product (AGOP) to enable task-specific feature learning with general machine learning models. When used in conjunction with kernel machines, iterating RFM results in a fast transition from random, near zero, test accuracy to perfect test accuracy. This transition cannot be predicted from the training loss, which is identically zero, nor from the test loss, which remains constant in initial iterations. Instead, as we show, the transition is completely determined by feature learning: RFM gradually learns block-circulant features to solve modular arithmetic. Paralleling the results for RFM, we show that neural networks that solve modular arithmetic also learn block-circulant features. Furthermore, we present theoretical evidence that RFM uses such block-circulant features to implement the Fourier Multiplication Algorithm, which prior work posited as the generalizing solution neural networks learn on these tasks. Our results demonstrate that emergence can result purely from learning task-relevant features and is not specific to neural architectures nor gradient descent-based optimization methods. Furthermore, our work provides more evidence for AGOP as a key mechanism for feature learning in neural networks.

matrix, modular arithmetic, neural network, (15 more...)

arXiv.org Machine Learning

2407.20199

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Machine learning for modular multiplication

Lauter, Kristin, Li, Cathy Yuanchen, Maughan, Krystal, Newton, Rachel, Srivastava, Megha

arXiv.org Artificial IntelligenceFeb-29-2024

Motivated by cryptographic applications, we investigate two machine learning approaches to modular multiplication: namely circular regression and a sequence-to-sequence transformer model. The limited success of both methods demonstrated in our results gives evidence for the hardness of tasks involving modular multiplication upon which cryptosystems are based.

accuracy, machine learning, modular multiplication, (14 more...)

arXiv.org Artificial Intelligence

2402.19254

Country:

North America > United States > Vermont (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Neural Networks and the Chomsky Hierarchy

Delétang, Grégoire, Ruoss, Anian, Grau-Moya, Jordi, Genewein, Tim, Wenliang, Li Kevin, Catt, Elliot, Cundy, Chris, Hutter, Marcus, Legg, Shane, Veness, Joel, Ortega, Pedro A.

arXiv.org Artificial IntelligenceFeb-28-2023

Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (20'910 models, 15 tasks) to investigate whether insights from the theory of computation can predict the limits of neural network generalization in practice. We demonstrate that grouping tasks according to the Chomsky hierarchy allows us to forecast whether certain architectures will be able to generalize to out-of-distribution inputs. This includes negative results where even extensive amounts of data and training time never lead to any non-trivial generalization, despite models having sufficient capacity to fit the training data perfectly. Our results show that, for our subset of tasks, RNNs and Transformers fail to generalize on non-regular tasks, LSTMs can solve regular and counter-language tasks, and only networks augmented with structured memory (such as a stack or memory tape) can successfully generalize on context-free and context-sensitive tasks.

artificial intelligence, machine learning, sequence, (18 more...)

arXiv.org Artificial Intelligence

2207.02098

Country:

Africa > Mali (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback